feat: add cpu/cuda config for prompt guard #2194
Conversation
Previously, prompt guard was hard-coded to require CUDA, which prevented it from being used on an instance without CUDA support. This PR allows prompt guard to be configured to use either CPU or CUDA. Signed-off-by: Michael Dawson <[email protected]>
@@ -75,7 +75,7 @@ def __init__(
         self.temperature = temperature
         self.threshold = threshold

-        self.device = "cuda"
+        self.device = self.config.guard_execution_type
Can we just check if CUDA is available and use that, otherwise use CPU? No need for a specific configuration like this to be added.
@ashwinb I'll take a look at that and update
requesting changes for my inline comment
Signed-off-by: Michael Dawson <[email protected]>
@ashwinb updated based on your suggestion. Thanks for taking the time to review my PR.
cool
What does this PR do?
Previously, prompt guard was hard-coded to require CUDA, which prevented it from being used on an instance without CUDA support.
This PR allows prompt guard to be configured to use either cpu or cuda.
Closes #2133
Test Plan (Edited after incorporating suggestion)
Started the stack configured with prompt guard as follows on a system without a GPU,
and validated that prompt guard could be used through the APIs.
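The config snippet itself was not captured here. As a purely hypothetical sketch of what a prompt guard safety provider entry in a run.yaml might look like (the provider identifiers and field names below are assumptions for illustration, not taken from this PR):

```yaml
# Hypothetical run.yaml fragment: enabling a prompt guard safety provider.
# Provider ids and config keys are illustrative assumptions only.
providers:
  safety:
    - provider_id: prompt-guard
      provider_type: inline::prompt-guard
      config: {}
```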
Validated on a system with a GPU (but without llama stack) that the Python code selecting between CPU and CUDA returned the right value when a CUDA device was available.
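The selection logic that was validated follows the reviewer's suggestion: probe for CUDA and fall back to CPU. A minimal sketch, assuming PyTorch is the backend (the helper name `select_device` is illustrative, not the function name used in the PR):

```python
def select_device() -> str:
    """Return "cuda" when a CUDA device is usable, otherwise "cpu".

    Wrapped in try/except so the sketch also runs where torch
    is not installed (hypothetical helper for illustration).
    """
    try:
        import torch

        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"


print(select_device())
```

On a GPU machine with CUDA drivers this prints `cuda`; everywhere else it prints `cpu`, which is what the test plan checked.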
Ran the unit tests as per https://github.com/meta-llama/llama-stack/blob/main/tests/unit/README.md